Business Problem Summary:
As a data scientist at a growing startup, one of the challenges is establishing a fair and consistent pay scale across all roles while considering employees' years of experience. Currently, each staff member negotiates their salary independently, and the company lacks both a structured pay scale and defined job levels.
Currently, there's no structured pay system in place, and all employees negotiated their salaries individually, leading to lot of discrepancies and biases in offers to new hires.
The aim is to introduce a transparent pay scale that categorizes roles based on years of experience, ensuring that new hires are not unfairly compensated compared to existing employees.
Business Problem Details:
Creating a fair salary framework:: The objective is to create a fair and transparent pay scale that categorizes roles based on years of experience, ensuring consistency and equality in compensation across the organization.
Introduction of Job Levels: By categorizing roles into job levels based on years of experience (e.g., 0-2 years as Analyst, 2-4 years as Associate, etc.), the organization aims to provide a structured framework for salary determination.
To arrive at a fair pay scale, salary data of current staff and their years of experiences are used
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv(r"C:\Users\Teni\Desktop\Datasets May-April\Salary_dataset.csv")
df.head()
YearsExperience | Salary | |
---|---|---|
0 | 1.2 | 39344 |
1 | 1.4 | 46206 |
2 | 1.6 | 37732 |
3 | 2.1 | 43526 |
4 | 2.3 | 39892 |
df.describe()
YearsExperience | Salary | |
---|---|---|
count | 30.000000 | 30.000000 |
mean | 5.413333 | 76004.000000 |
std | 2.837888 | 27414.429785 |
min | 1.200000 | 37732.000000 |
25% | 3.300000 | 56721.750000 |
50% | 4.800000 | 65238.000000 |
75% | 7.800000 | 100545.750000 |
max | 10.600000 | 122392.000000 |
sns.regplot(data=df, x= 'YearsExperience', y = 'Salary')
# at experience 0, the data shows the salary is just below 40000 (start pay)
<AxesSubplot:xlabel='YearsExperience', ylabel='Salary'>
- Much as the pay has always been individually negotiated, there's a linear reationship between the years of experience and the salaries
X = df['YearsExperience']
y = df['Salary']
Because the dataset has only one Feature, the Simple Linear Regression will be used.
# y = mx+b
np.polyfit(X, y, deg=1)
# the starting salary an 0-year expereince is #24848
array([ 9449.96232146, 24848.20396652])
pro_exp = np.linspace(1,10, 30)
pro_exp
array([ 1. , 1.31034483, 1.62068966, 1.93103448, 2.24137931, 2.55172414, 2.86206897, 3.17241379, 3.48275862, 3.79310345, 4.10344828, 4.4137931 , 4.72413793, 5.03448276, 5.34482759, 5.65517241, 5.96551724, 6.27586207, 6.5862069 , 6.89655172, 7.20689655, 7.51724138, 7.82758621, 8.13793103, 8.44827586, 8.75862069, 9.06896552, 9.37931034, 9.68965517, 10. ])
pred_sal = 9449.96232146*pro_exp + 24848.20396652
pred_sal
array([ 34298.16628798, 37230.91321533, 40163.66014268, 43096.40707003, 46029.15399738, 48961.90092473, 51894.64785208, 54827.39477943, 57760.14170678, 60692.88863413, 63625.63556148, 66558.38248883, 69491.12941618, 72423.87634353, 75356.62327088, 78289.37019822, 81222.11712557, 84154.86405292, 87087.61098027, 90020.35790762, 92953.10483497, 95885.85176232, 98818.59868967, 101751.34561702, 104684.09254437, 107616.83947172, 110549.58639907, 113482.33332642, 116415.08025377, 119347.82718112])
sns.scatterplot(data=df, x='YearsExperience', y='Salary')
plt.plot(pro_exp, pred_sal, color='red')
[<matplotlib.lines.Line2D at 0x2097bd9d700>]
The HR team approached me, saying we have 2 new cabdidates whose years of experiences are 10 and 5; asking how much the offer should be
exp = 10
pred_sal = 9449.96232146*exp + 24848.20396652
print("For a 10-year work experience, the salary offer should be: $", round(pred_sal, 2))
For a 10_year work experience, the salary offer should be: $ 119347.83
exp=5
pred_sal = 9449.96232146*exp + 24848.20396652
print("For a 5-year work experience, the salary offer should be: $", round(pred_sal, 2))
For a 5-year work experience, the salary offer should be: $ 72098.02
By aligning roles with these well-defined job levels, the organization provides a transparent roadmap for career advancement. By using the equation (y = m*(years of experience) + b), managers are able to negotiate fairly when hiring new talents- without ofshooting the payscale within the organization.
Fair and Transparent Pay Scale: The proposed salary framework introduces a fair and transparent pay scale by using a polyfit equation derived from the array ([9449.96232146, 24848.20396652]). This equation, represented as y = mx + b, determines salaries (y) based on years of experience (x). By relying on this mathematical approach, the organization eliminates potential biases and ensures consistency in compensation across all roles and levels.
Well-Defined Career Progression: The framework establishes clear job levels, enabling structured career growth and progression within the organization. These job levels are: